Model Evaluation Suite

The suite is composed of various checks, such as Metadata Segments Performance, Prediction Drift, and Train Test Performance.
Each check may contain conditions (which result in a pass, fail, warning, or error status) as well as other outputs such as plots or tables.
Suites, checks, and conditions can all be modified.
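A report like this one is typically produced by running the built-in suite on a train and a test dataset. The following is a minimal sketch, assuming the `deepchecks.nlp` package is installed and that the datasets are already wrapped as `TextData` objects; the helper name `run_model_evaluation` and its parameter names are hypothetical and not part of the library.

```python
def run_model_evaluation(train, test, train_preds, test_preds,
                         train_probas, test_probas):
    """Run the NLP model evaluation suite on two TextData objects.

    `train` and `test` are assumed to be deepchecks.nlp.TextData instances;
    the predictions and probabilities come from the model being evaluated.
    """
    # Imported lazily so the sketch can be read without deepchecks installed.
    from deepchecks.nlp.suites import model_evaluation

    suite = model_evaluation()
    # Predictions are passed in directly rather than computed from a model.
    return suite.run(
        train_dataset=train,
        test_dataset=test,
        train_predictions=train_preds,
        test_predictions=test_preds,
        train_probabilities=train_probas,
        test_probabilities=test_probas,
    )
```

Calling `show()` on the returned result in a notebook should render the conditions summary and the per-check outputs listed below.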


Conditions Summary

Check | Condition | More Info
Train Test Performance | Train-Test scores relative degradation is less than 0.1 | 6 scores failed. Found max degradation of 17.33% for metric Recall and class 1.
Prediction Drift | Prediction drift score < 0.15 | Found model prediction Kolmogorov-Smirnov drift score of 0.03
Property Segments Performance - Train Dataset | The relative performance of weakest segment is greater than 80% of average model performance. | WeakSegmentsPerformance was unable to train an error model to find weak segments. Try supplying additional properties.

Check With Conditions Output

Train Test Performance

Summarizes the given model's performance on the train and test datasets based on selected scorers.

Conditions Summary
Condition | More Info
Train-Test scores relative degradation is less than 0.1 | 6 scores failed. Found max degradation of 17.33% for metric Recall and class 1.
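The condition above compares each train score with its test counterpart. A minimal sketch of that comparison follows, assuming relative degradation is defined as (train - test) / train; the score values are hypothetical and chosen only for illustration, and this is not the library's own implementation.

```python
def relative_degradation(train_score: float, test_score: float) -> float:
    """Fraction of the train score that is lost on the test set."""
    if train_score == 0:
        # Assumption: treat a zero train score as "no measurable degradation".
        return 0.0
    return (train_score - test_score) / train_score


THRESHOLD = 0.1  # same threshold as the condition in the report

# Hypothetical (train, test) recall values per class.
scores = {
    "Recall, class 0": (0.90, 0.88),
    "Recall, class 1": (0.75, 0.62),
}

# Collect every score whose degradation reaches the threshold.
failed = {}
for name, (train_score, test_score) in scores.items():
    degradation = relative_degradation(train_score, test_score)
    if degradation >= THRESHOLD:
        failed[name] = degradation

for name, degradation in failed.items():
    print(f"{name}: degradation of {degradation:.2%}")
```

With these made-up inputs only the class 1 recall exceeds the 0.1 threshold, which mirrors the shape of the failure reported above.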
Additional Outputs


Prediction Drift

Calculates prediction drift between the train dataset and the test dataset, using statistical measures.

Conditions Summary
Condition | More Info
Prediction drift score < 0.15 | Found model prediction Kolmogorov-Smirnov drift score of 0.03
Additional Outputs
The drift score measures the difference between two distributions; in this check, the train and test distributions.
The check shows the drift score and distributions for the predicted class probabilities.
Discrete distribution plots show the top 10 categories with the largest difference between train and test.
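To make the Kolmogorov-Smirnov score concrete, here is a small pure-Python sketch of the two-sample KS statistic, which is the largest gap between the two empirical cumulative distribution functions. It assumes the reported drift score is this plain statistic; the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(train_values, test_values):
    """Largest absolute gap between the two empirical CDFs."""
    train_sorted = sorted(train_values)
    test_sorted = sorted(test_values)
    # The gap can only change at observed values, so check each one.
    points = sorted(set(train_sorted) | set(test_sorted))
    return max(
        abs(
            bisect_right(train_sorted, x) / len(train_sorted)
            - bisect_right(test_sorted, x) / len(test_sorted)
        )
        for x in points
    )


# Hypothetical predicted probabilities for one class.
train_probas = [0.1, 0.2, 0.3, 0.4]
test_probas = [0.3, 0.4, 0.5, 0.6]

drift_score = ks_statistic(train_probas, test_probas)
condition_passed = drift_score < 0.15  # same threshold as the condition above
```

A score of 0 means the two samples have identical empirical distributions, and 1 means they do not overlap at all, so the 0.03 reported above indicates very little drift.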


Property Segments Performance - Train Dataset

Searches for segments with low performance scores.

Conditions Summary
Condition | More Info
The relative performance of weakest segment is greater than 80% of average model performance. | WeakSegmentsPerformance was unable to train an error model to find weak segments. Try supplying additional properties.
Additional Outputs
WeakSegmentsPerformance was unable to train an error model to find weak segments. Try supplying additional properties.
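Had the check found segments, the condition would compare the weakest segment's score with the overall score. The sketch below illustrates that ratio test, assuming "relative performance" means the segment score divided by the overall model score; the segment names and scores are hypothetical, and the error-model search that produces the segments is not shown.

```python
def weakest_segment_ratio(segment_scores, overall_score):
    """Return the weakest segment and its score relative to the overall score."""
    name = min(segment_scores, key=segment_scores.get)
    return name, segment_scores[name] / overall_score


# Hypothetical segments defined by a text property such as text length.
segment_scores = {
    "short texts": 0.82,
    "medium texts": 0.85,
    "long texts": 0.60,
}

weakest, ratio = weakest_segment_ratio(segment_scores, overall_score=0.80)
condition_passed = ratio > 0.8  # "greater than 80% of average model performance"
```

In this made-up case the weakest segment reaches about 75% of the overall score, so the condition would not pass.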


Check Without Conditions Output


Other Checks That Weren't Displayed

Check | Reason
Property Segments Performance - Test Dataset | `np.NINF` was removed in the NumPy 2.0 release. Use `-np.inf` instead.
Metadata Segments Performance - Train Dataset | Functionality requires metadata, but the TextData object had none. To use this functionality, use the set_metadata method to set your own metadata with a pandas.DataFrame.
Metadata Segments Performance - Test Dataset | Functionality requires metadata, but the TextData object had none. To use this functionality, use the set_metadata method to set your own metadata with a pandas.DataFrame.
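The two metadata checks were skipped because no metadata was attached to the datasets. One possible way to supply it is sketched below, assuming `set_metadata` accepts a `pandas.DataFrame` with one row per text sample, as the message above indicates. The helper names and the `source` and `user_age` columns are hypothetical.

```python
def validate_metadata(n_samples, records):
    """Check that there is one record per sample and that all keys match."""
    if len(records) != n_samples:
        raise ValueError(
            f"expected {n_samples} metadata rows, got {len(records)}"
        )
    columns = sorted(records[0])
    for record in records:
        if sorted(record) != columns:
            raise ValueError("all metadata rows must have the same columns")
    return columns


def attach_metadata(text_data, n_samples, records):
    """Attach per-sample metadata to a TextData object (assumed API)."""
    # Imported lazily so the validation helper works without pandas.
    import pandas as pd

    validate_metadata(n_samples, records)
    text_data.set_metadata(pd.DataFrame(records))


# Hypothetical metadata for a dataset of three text samples.
records = [
    {"source": "forum", "user_age": 31},
    {"source": "email", "user_age": 45},
    {"source": "forum", "user_age": 27},
]
columns = validate_metadata(3, records)
```

The same step would be needed for both the train and the test dataset, since each appears as a separate row in the table above.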
